Historic Learning Approach for Auto-tuning OpenACC Accelerated Scientific Applications
Abstract
Optimizing the performance of scientific applications usually requires in-depth knowledge of both the hardware and the software. We propose a performance tuning mechanism that automatically tunes OpenACC parameters to adapt to the execution environment of a given system. A methodology based on historic learning prunes the parameter search space to make auto-tuning more efficient. The approach is used to tune the OpenACC gang and vector clauses so that compute kernels map better onto the underlying architecture. Our experiments show a significant performance improvement over the default compiler parameters and a drastic reduction in tuning time compared with a brute-force search.
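The abstract does not spell out the tuning algorithm, so the following is only a minimal sketch of the general idea of history-based pruning, not the authors' method. It assumes that the best gang and vector settings found for an earlier problem size are a good starting point for a similar size, and it searches only the neighbouring values. The candidate lists, the `synthetic_runtime` cost model, and the `radius` parameter are all hypothetical; a real tuner would compile and time the kernel with the chosen gang and vector clause values instead.

```python
import itertools

# Candidate values for the OpenACC gang and vector clauses (assumed ranges).
GANGS = [32, 64, 128, 256, 512, 1024]
VECTORS = [32, 64, 128, 256, 512, 1024]


def synthetic_runtime(gangs, vector, size):
    """Hypothetical stand-in for timing a kernel built with these clause values.

    The toy model has a single minimum so the search logic can be exercised
    without a compiler or an accelerator.
    """
    best_vector = 128
    best_gangs = max(32, min(1024, size // 128))
    return 1.0 + abs(gangs - best_gangs) / 1024 + abs(vector - best_vector) / 1024


def brute_force(size, measure=synthetic_runtime):
    """Time every (gangs, vector) pair; return (best config, evaluations)."""
    space = list(itertools.product(GANGS, VECTORS))
    best = min(space, key=lambda cfg: measure(cfg[0], cfg[1], size))
    return best, len(space)


def pruned_search(size, history, measure=synthetic_runtime, radius=1):
    """Search only near the best config recorded for the closest past size."""
    if not history:
        return brute_force(size, measure)
    nearest = min(history, key=lambda s: abs(s - size))
    g0, v0 = history[nearest]
    gi, vi = GANGS.index(g0), VECTORS.index(v0)
    gangs = GANGS[max(0, gi - radius): gi + radius + 1]
    vectors = VECTORS[max(0, vi - radius): vi + radius + 1]
    space = list(itertools.product(gangs, vectors))
    best = min(space, key=lambda cfg: measure(cfg[0], cfg[1], size))
    return best, len(space)


# Record one past tuning run, then tune a new problem size both ways.
history = {}
history[16384], _ = brute_force(16384)
best_full, evals_full = brute_force(32768)
best_pruned, evals_pruned = pruned_search(32768, history)
print(best_full, evals_full, best_pruned, evals_pruned)
```

In this toy setting the pruned search times far fewer configurations than the exhaustive one. The trade-off is that a small `radius` can miss the optimum when the recorded history comes from a workload that differs too much from the new one.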
Similar resources
ACC-SVM: Accelerating SVM on GPUs using OpenACC
GPUs have been successfully applied in scientific computing in the last decade. Many machine learning algorithms have also used GPUs to accelerate their computations. This includes the Support Vector Machine (SVM) which is a classical machine learning algorithm that has been successfully used in many applications such as text classification and image recognition. There have been many open-sourc...
Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers
GPUs, with their high bandwidths and computational capabilities, are an increasingly popular target for scientific computing. Unfortunately, to date, harnessing the power of the GPU has required use of a GPU-specific programming model like CUDA, OpenCL, or OpenACC. As such, in order to deliver portability across CPU-based and GPU-accelerated supercomputers, programmers are forced to write and ma...
Collective mind: Towards practical and collaborative auto-tuning
Empirical auto-tuning and machine learning techniques have been showing high potential to improve execution time, power consumption, code size, reliability and other important metrics of various applications for more than two decades. However, they are still far from widespread production use due to lack of native support for auto-tuning in an ever changing and complex software and hardware sta...
A Feasibility Study on Porting the Community Land Model onto Accelerators Using OpenACC
As environmental models (such as the Accelerated Climate Model for Energy (ACME), the Parallel Reactive Flow and Transport Model (PFLOTRAN), the Arctic Terrestrial Simulator (ATS), etc.) become more and more complicated, we face enormous challenges in porting those applications onto hybrid computing architectures. OpenACC has emerged as a very promising technology; therefore, we have conducted a...
Performance and Portability of Accelerated Lattice Boltzmann Applications with OpenACC
An increasingly large number of HPC systems rely on heterogeneous architectures combining traditional multi-core CPUs with power-efficient accelerators. Designing efficient applications for these systems has been troublesome in the past, as accelerators could usually be programmed using specific programming languages, threatening maintainability, portability and correctness. Several new programmi...